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; ABSTRACT 

^ ■ We report new constraints on the local escape speed of our Galaxy. Our analysis is based on 

' a sample of high velocity stars from the RAVE survey and two previously published datasets. 

, We use cosmological simulations of disk galaxy formation to motivate our assumptions on the 

■ shape of the velocity distribution, allowing for a significantly more precise measurement of 
■ ■ ■ ' the escape velocity compared to previous studies. We find that the escape velocity lies within 

the range 498 km s ' < Vesc < 608 km s ' (90 per cent confidence), with a median likelihood 
of 544 km s"'. The fact that x'l^^ is significantly greater than 2v^.^^^ (where Vcirc = 220 km s"' is 
the local circular velocity) implies that there must be a significant amount of mass exterior to 
the Solar circle, i.e. this convincingly demonstrates the presence of a dark halo in the Galaxy. 
We use our constraints on Vesc to determine the mass of the Milky Way halo for three halo 
profiles. For example, an adiabatically contracted NFW halo model results in a virial mass of 
1.42^g^^ X 10'~Mo and virial radius of 305 kpc (90 per cent confidence). For this model 
the circular velocity at the virial radius is 142;^,[ km s"'. Although our halo masses are model 
dependent, we find that they are in good agreement with each other. 

Key words: Galaxy: kinematics and dynamics. Galaxy: fundamental parameters 



1 INTRODUCTION 

The existence of a dark halo around the Milky Way has been known 
for many years, although its nature is still uncertain. The mass and 
extent of this halo is a significant issue in astronomy. One vital tool 
which can be used to tackle this problem is also one of the simplest 
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- the escape speed. If we are able to determine the escape speed 
at the solar neighbourhood, i.e. the velocity that a star requires to 
escape the local gravitational field of the Milky Way, then this can 
provide important constraints on the extent of the dark halo. The 
reason why this quantity is so important is because it is the only 
local dynamical measurement that can be used to probe the extent 
of the mass distribution outside the solar circle. Unlike the circular 
velocity, which depends primarily on the mass interior to the so- 



2 Smith et al. 



lar circle, the escape velocity contains information about the mass 
exterior to the solar circle. Although one needs a model for this 
mass distribution, the escape velocity (i.e. the local gravitational 
potential) can be used as a constraint from which it is possible to 
determine the total mass. 

It is possible to use more distant measurements to investigate 
the extent of the Galactic halo. Unfortunately, gas rotation curves 
cannot be traced beyond the extent of gas in circular orbits, ~ 20 
kpc for the Milky Way. The task of tracing the rotation curve is also 
complicated by the fact that velocities have to be accompanied by 
distances (Binney & Dehnen 1997) and, in any case, our Galaxy 
does not appear to have an extended HI disk. As a consequence, 
most methods of probing the halo rely on satellites and globu- 
lar clusters, whose velocities can be measured out to significantly 
greater Galactocentric distances. Many authors have used the ve- 
locities of the Milky Way's satellite galaxies and globular clusters 
in an attempt to constrain the total halo mass. Although numerous 
papers have dealt with this subject (Little & Tremaine 1987; Zarit- 
sky et al. 1989; Kulessa & Lynden-Bell 1992; Kochanek 1996), two 
of the more recent ones to exploit the motions of satellite galaxies 
and globular clusters have concluded the total mass of the halo to 
be around 2 x 10'^ Mq ; Wilkinson & Evans (1999) found a halo 
mass of ~ 1-9^]"^ x lO'^ M© by adopting a halo model which pro- 
duces a flat rotation curve that is truncated beyond an outer edge; 
whereas Sakamoto, Chiba & Beers (2003), using a halo potential 
that gives a flat rotation curve, also included the velocities of field 
horizontal-branch stars to find a total halo mass of 2.5^, ^ x 10'* Mq 
or I.8^Q7 X 10'' Mq, depending on whether or not the analysis in- 
cludes Leo I. Another complementary approach that can be adopted 
is to analyse the radial velocity dispersion profile of halo objects; 
Battaglia et al. (2005; 2006) used this method to determine a total 
mass of 0.5 - 1.5 x 10'^ Mq depending on the chosen model for the 
halo profile (see also Dehnen, McLaughlin & Sachania (2006) for a 
reanalysis of this dataset). The future for this field looks promising 
with space missions such as Gaia (due for launch 2011; Ferryman 
et al. 2001; Wilkinson et al. 2006) and Space Interferometery Mis- 
sion (due for launch ~ 2015; Allen, Shao & Peterson 1998), since 
such missions will be able to provide accurate proper motion infor- 
mation to complement the existing radial velocity measurements; 
with such high quality data it should be possible to determine the 
mass of the Milky Way to ~ 20 per cent (Wilkinson & Evans 1999). 

One can see that the current results mentioned above still pro- 
duce a factor of ~ 2 uncertainty in the mass of the Milky Way, due 
to the fact that the results are still model dependent and are hin- 
dered by small number statistics concerning the relevant datasets. 
Therefore it would be very valuable if one could provide tight con- 
straints on the local escape velocity in order to pin down the grav- 
itational potential at this point. As far back as the 1920s samples 
of high velocity stars were being used to estimate the local escape 
velocity (for example, Oort 1926; Oort 1928). As the 20th century 
progressed, many papers refined the estimate of Vesc (see Fich & 
Tremaine [1991] for a review), culminating in the final decade with 
the seminal work of Leonard & Tremaine (1990, hereafter LT90) 
and the subsequent refinement by Kochanek (1996, hereafter K96). 
These two papers concluded that, to 90 per cent confidence, the es- 
cape velocity lies in the range 450 km s"' < Vesc < 650 km s"' and 
489 kms"' < Ve^c < 730 kms"', respectively. Their conclusions are 
hampered by several problems: firstly, the paucity of high velocity 
stars from which to estimate Ves^; secondly the fact that biases were 
introduced by selecting high velocity stars from proper-motion sur- 
veys; and thirdly the uncertainty in the assumptions regarding the 
underlying form of the tail of the velocity distribution. In this new 



century the difficulties posed by the first two issues are to some 
extent diminishing due to the large kinematically unbiased surveys 
that are now underway or planned, such as RAdial Velocity Exper- 
iment (RAVE; Steinmetz et al. 2006; see also Section l4Tt . Sloan 
Extension for Galactic Understanding and Exploration (SEGUE; 
Beers et al. 2004) and Gaia (Ferryman et al. 2001). The latter prob- 
lem can be tackled through various methods; one such approach 
could be to use predictions from cosmological simulations to esti- 
mate the form of the velocity distributions. In this paper we shall 
make use of the advancement afforded to us by the RAVE survey, 
combined with the analysis of cosmological simulations, to refine 
the determination of I'esc- 

The outline of this paper is as follows. In Section|2]we review 
the analytical techniques that have been developed to constrain the 
escape velocity from a sample of velocity measurements. Then in 
the following section we assess various aspects of these techniques 
using cosmological simulations. In particular we use the simula- 
tions to estimate the expected shape of the tail of the velocity distri- 
bution, which is a crucial ingredient in the escape velocity analysis. 
In Section |4] we present the data that we will use to constrain the 
escape velocity and undertake some tests to ensure that these data 
are reliable. Our new data come from the RAVE project (Steinmetz 
et al. 2006), but are augmented with archival data from published 
surveys. In Section |5] we present our results and in Section [6] we 
consider some of the issues arising from or concerning these re- 
sults; in particular, this latter section discusses the nature of our 
high velocity stars (Section [6. It , the effect of the sample volume 
on the recovery of the escape velocity J6.2t . the possible contami- 
nation from unbound stars \63\ and also uses our new constraints 
on Vesc to investigate the total mass of the Galactic halo l |6.4| l. In 
Section|7]we conclude our paper with a brief summary. 



2 ANALYSIS TECHNIQUES 
2.1 Likelihood 

The techniques that we apply in order to constrain the escape veloc- 
ity (I'csc) are based on those established by LT90. They parametrize 
the distribution of stellar velocities around Vesc according to the fol- 
lowing formula, 

/(|v||Ve,„^)cc(Ve,,-|v|)*, |v|< Ve.c (D 
,A|v||Ve,„^) = 0, |v|>Ve,,, (2) 

where |v| is the speed of the star and k describes the shape of the 
velocity distribution near Vesc. Note that this approach is only valid 
if the stellar velocities do indeed extend all the way to Vesc . If there 
is any truncation in the velocities then this approach will underes- 
timate the true v^^c ■ 

Under the assumption that the Jeans theorem can be applied 
to the the system. Equation l[T) can be understood by considering 
the distribution function for the energies of the stars, e. Assuming 
there is no anisotropy in the velocities, we can write the asymptotic 
form of the distribution function as a power-law (K96), 

/(6)oce*, where e = -((J -I- |v|V2), (3) 

where O corresponds to the potential energy and |v|^/2 to the ki- 
netic energy. Again k describes the shape of the velocity distribu- 
tion near Vesc. Clearly C) = -v^^jl, which results in the following 
simple form for the asymptotic behaviour of the velocity distribu- 
tion, 

f{\y\ I Vesc, k) OC (Ve^sc " Wl'f = [(Vesc - lvl)(Ve.sc + Iv])]*^. (4) 



The RAVE Survey: Constraining the Local Galactic Escape Speed 3 



It could be argued that the velocity distribution near v^^c will 
not follow the form given in equation (|4j since the orbital peri- 
ods of such high velocity stars can be comparable to or larger than 
the age of the system (hence invalidating Jeans Theorem). In this 
case the velocity distribution near Vesc would be a superposition of 
a small number of streams. This is likely to be an important is- 
sue only in the tail of the velocity distribution (i.e. for stars with 
V > 3cr, where tr is the velocity dispersion of the system, see Helmi, 
White & Springel [2002]). With this important caveat, in this paper 
we shall follow the traditional assumptions reflected in the above 
equations, and highlight possible limitations where appropriate. 

Intuitively we expect that > 0, in which case /(|v|) — > 
as |v| — > Vesc- However this is not a necessary condition provided 
one accepts the presence of a discontinuity at /(|v| = Vesc). The 
likelihood of such distributions can be observationally constrained 
(as will be shown in Section [5] we find that values of < are 
strongly disfavoured). 

Equation Q can be further simplified in the limit of (|v| — > 
Vesc) to l[T) by neglecting the (Vesc -i- |v|)* term, which has a strong 
systematic variation with k. However, throughout this paper, unless 
explicitly stated otherwise, we adopt l|4]l. 

In LT90, it was argued that radial velocities alone provided 
the most reliable constraints on Vesc, since tangential velocities are 
much more uncertain due to inaccuracies in measuring proper mo- 
tions and estimating distances. For example, a proper motion sur- 
vey with typical errors of ~ 10 per cent in both proper motion and 
distance would result in an error of ~ 60 km s"' for a star with 
velocity of 400 km s"', whereas radial velocities can be measured 
with an accuracy of typically less than a few kms"' (see Section 
O. In addition, since our work is motivated by the current advances 
in radial velocity surveys, we shall not incorporate the tangential 
velocities into our analysis. 

To apply equation (|4]l to a sample of radial velocities, we must 
average over the unknown tangential velocities. 



/(vesc, fc) for the unknown quantities to be estimated can be defined 
as: 



/r(v,|vesc, k) = /(|v| | v^,^,k)S(v, - v.n)dv. 



(5) 



where n is the unit vector along the line-of-sight. Clearly, unless 
the lines-of-sight are isotropically distributed, this equation is only 
valid for an isotropic distribution function. For our dataset (the 
RAVE survey) we do not have all-sky coverage as RAVE only mon- 
itors the Southern hemisphere. However, even if the distribution 
function is anisotropic, this equation is still valid for a half-sky sur- 
vey provided the mean motion is small; as we shall see in Section 
13.21 we are dealing almost exclusively with halo stars whose mean 
motion is indeed small. Although the RAVE survey does not cover 
the entire Southern sky, any corresponding bias should be negligi- 
ble compared to the relatively large statistical uncertainties in our 
final result. In any case, if the distribution function is assumed to be 
isotropic then the issue of sky coverage is immaterial since equa- 
tion ^ is then valid for any sky coverage. 

Evaluating equation lO for the LT90 formalism (equation [T]!, 
we obtain, 



/■(VrlVesc,*:) 0^ (Vesc - V,)* 



(6) 



For the formalism given in equation we integrate using cylin- 
drical polar coordinates to obtain. 



/r(Vr|Vesc,A:)cc(v;se-v2)*-^'. 



(7) 



I (Vesc, ^) = ]~~[ /r (Vr,,|Vesc, ■ 



(8) 



where /r(i'r,;|vesc, k) is given by either equation l[6ll or l|7j and i'^., are 
the radial velocities of the individual n stars. 

It can also be useful and sometimes more robust (especially 
for small samples) to apply prior knowledge about I'esc and k by 
way of Bayes' Theorem. Given Vr,, radial velocity observations, the 
probability of finding v^^c and k in the ranges Vesc to I'esc + rfvesc and 
ktok + dk, respectively, is given as: 



^(Vesc,fc|Vr.,= l.....„) ^ 



P(Vesc)f(fc)H;.LiP(v,,|Vesc,^) 

/ / ^(Vesc)'P(^')n;.L,P(v„|v;,„^')dv^scd^'' 



(9) 



In order to constrain v^^c and k for a sample of n stars, we 
employ the maximum likelihood method. The likelihood function 



P (iv.i h'cso fc) is proportional to /r(vj_,|i'esc> '^)- /'(^esc) and P{k) are the 
a priori probabilities (i.e., prior knowledge) of Vesc and k, respec- 
tively. It is this equation, known as the posterior distribution, with 
chosen reference priors /'(vesc) and P(k), that will be maximized to 
determine I'esc and k. 

In general the form of the distribution function given above 
(equation[3) is only true asymptotically as v — > Vesc, and so to eval- 
uate this probability we must impose a lower limit (v„^i„) on the 
magnitude of the radial velocities that we will consider in our anal- 
ysis; the choice of v^in is investigated further in Section lT2l As will 
be shown later in Section|4] this form provides a good fit to the data 
over our chosen range of velocities. 

Once the distributions given in equations l[6) & Q have been 
normalized such that J "'^ /"(VrlVesc, fc)dVr = 1 we can then evaluate 
equation One important factor in the evaluation of equation ^ 
is the choice of apriori probability; this is discussed in Section l231 



2.2 The bootstrap 

A short-coming of the method described above is that it assumes 
that the distribution of stellar velocities is in equilibrium and 
steady-state. However, there are many potential limitations to this 
assumption, for example velocity substructure, binary systems for 
which the centre-of-mass velocity has not been measured accu- 
rately or non-equilibrium stars such as those ejected from binary 
systems (including hyper-velocity stars, which are plausibly ejected 
from the central regions of the Galaxy after interaction with the 
super-massive black hole at the Galactic Centre; e.g. Brown et al. 
[2006]). All of these mechanisms for potential contamination can 
statistically perturb our underlying distribution function. They can 
change how the overall observed distribution function is populated, 
(e.g. fluctuations due to the addition of some orbital velocity with 
amplitude dependent on phase and inclination of the binary orbit). 
However, these mechanisms are not well modelled, which makes it 
difficult to accommodate these effects. It is crucial therefore to em- 
ploy a bootstrap method, which performs 'resampling' on our orig- 
inal data-set to assess the sensitivity to possible non-equilibrium 
stellar velocities. 

In brief, the bootstrap method involves randomly resampling 
the original dataset (with replacement) to create artificial 'boot- 
strap' samples. Each bootstrap (re)sample provides one value each 
for k and Vesc that has maximum likelihood for that (re)sample of 
the data. This is repeated a large number of times (typically 5000), 
and the distribution of these maximum likelihood pairs is defined as 
the 'bootstrap distribution' . The bootstrap distribution can be used 
to estimate the sampling distribution of the maximum likelihood 
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values for k and v^sc, e.g. as obtained in equation (|9}. Thiis is ex- 
tremely useful, because the fact we have a small sample means we 
cannot necessarily rely on maximum likelihood estimators having 
converged to normality. Therefore, we can use the bootstrap distri- 
bution as a means to model the estimator sampling distributions. 

Typically the bootstrap distribution is approximately normal, 
which allows us to rely less on hope that the original sample size is 
large enough for the central limit theorem to apply. The bootstrap 
distribution has bias if its mean values for k and Vesc are not the 
same as those found for the original sample. Bias and skewness in 
the bootstrap distribution are statistical signatures that can give us 
a general understanding of possible fluctuations due to any 'con- 
taminating' velocities, as explained above. These signatures can 
be studied using standard techniques (Davison & Hinkley 1997). 
However, if the bootstrap distribution significantly deviates from 
a normal distribution, steps must be taken to model the distribu- 
tion. The bootstrap standard error is the standard deviation of the 
bootstrap distribution and the 90th-percentile confidence intervals 
correspond to the regions containing 90 per cent of the maximum 
likelihood pairs {k, v^^)- This is useful, because now the bootstrap 
estimates of bias, standard enw and confidence interval endpoints 
are random variables. Their variances can be reduced by increas- 
ing the number of bootstrap samples. However, the quality of the 
bootstrap approximation still depends on the original sample size. 

2.3 Considering tlie a priori probability for k and Vcsc 

To evaluate equation l|9) we must first choose the form of the pri- 
ors for k and Vesc. In previous work, the prior for Vesc was chosen 
to be P(Vesc) l/i'esc, siucc this is appropriate for a variable that 
varies continuously from to infinity (Kendall & Stuart 1977). 
However, the choice of the prior for k requires more thought. In 
Section im we expressed the velocity distribution near the escape 
velocity assuming that the asymptotic behaviour follows a power 
law (equations |3}@ll. Unfortunately, the range of feasible values 
that one would expect for this exponent k is uncertain. Previous 
work has shown that to obtain meaningful limits on Vesc from a sin- 
gle maximum likelihood analysis of a solar neighbourhood sample 
of stars one must assume some prior for k, since only very large 
samples of stars would allow one to constrain both k and v^^^ si- 
multaneously. LT90 used Monte Carlo simulations to estimate that 
samples of ~ 200 stars with accurate radial velocities will be re- 
quired to constrain simultaneously k and v^^^ for I'^in = 250 km s"' 
using the formalism given in equation 

For a given distribution function it is possible to predict the 
behaviour of k. For example, a Plummer model (1911) has the dis- 
tribution function, /(e) oc e^^, i.e. k = 3.5. Similarly, to first order 
approximation a Hernquist (1990) model has a distribution func- 
tion with k = 2.5. However, LT90 argued that for a sample of 
stars free from selection effects, k should lie between 1 and 2. They 
noted that an isolated system that has undergone violent relaxation 
should have k = 1.5 (Aguilar & White 1986; Jafi'e 1987; Tremaine 
1987) and a coUisionally relaxed system, such as a globular cluster, 
should have k = I (Spitzer & Shapiro 1972). K96 chose to as- 
sume a uniform prior onk e [0.5, 2.5], which brackets this violent- 
relaxation value. Another approach one can employ to understand 
the nature of k is to use cosmological simulations; we implement 
this approach in Section lJTI 

Unlike previous work, we will also apply the bootstrap tech- 
nique (Section |2.2t to ascertain whether our results are sensitive 
to possible problems with the data and allow confidence estima- 
tion for the maximum likelihood values of v^x and k. Since we 



adopt bootstrap resampling we cannot simply adopt the reference 
priors chosen by LT90, without further investigation. In Appendix 
|A]we investigate the choice of prior by applying the bootstrap anal- 
ysis to simulated datasets. From this work we conclude that a prior 
P(k) oc yficKk + 2), P(Vcsc) o: l/(Vesc - Vmin) IS the optimal choice, 
although for comparison we also evaluate the bootstrap constraints 
using the classical LT90 choice, i.e. P(k} oc 1, P(Vtsc) l/i'csc- 



3 EXPLOITING COSMOLOGICAL SIMULATIONS TO 
ASSESS ANALYSIS TECHNIQUES 

In this section we assess the reliability of the above techniques in 
recovering the escape velocity. We do this by analysing a suite of 
four cosmological simulations. The first subsection deals with plac- 
ing a priori limits on the possible values of the exponent k, which 
parametrizes the shape of the velocity distribution, while the second 
subsection deals with the effect of thin- and thick-disc contamina- 
tion on the recovery of Vesc and the choice of Vnii„. 

3.1 The shape of the velocity distribution and its relation to 
the parameter k 

In Section [231 we discussed the nature of the a priori information 
that we can incorporate into the evaluation of the likelihood func- 
tion (equation |9]l. We explained the motivation behind the choice of 
prior in K96, i.e. a uniform prior on k e [0.5,2.5]. However, with 
cosmological simulations it is possible to re-assess this choice. It 
was also noted in LT90 that if we have a system where the stel- 
lar velocities do not extend up to Vesc, the statistical arguments of 
Section im will always underestimate the true v^sc- So, in order to 
evaluate the ability of the above statistical method to recover v^sc, it 
is vital that we understand the distribution of stellar velocities. 

To test these issues we will analyse a set of four galaxies from 
cosmological simulations, labelled KIA1-KIA4 (Abadi, Navarro & 
Steinmetz 2006). These galaxies have been introduced in earlier pa- 
pers and we advise interested readers to consult the following refer- 
ences for details regarding the code and the numerical setup: Stein- 
metz & Navarro (2002), Abadi et al. (2003a,b), Meza et al. (2003, 
2005). These simulations, which are carried out in a ACDM uni- 
verse, include the gravitational effects of dark matter, gas and stars, 
and follow the hydrodynamical evolution of the gaseous component 
using the Smooth Particle Hydrodynamics technique (Steinmetz 
1996). The following cosmological parameters were adopted for 
the ACDM scenario: Ho = 65 km s"' Mpc"', erg = 0.9, Qa = 0.7, 
^^CDM = 0.255, flbai = 0.045, with no tilt in the primordial power 
spectrum. All simulations started at redshift Zi„it = 50, have force 
resolution of order 1 kpc, and mass resolution so that each galaxy 
is represented, at z = 0, with ~ 125, 000 star particles. The range 
of masses of the stellar particles in these four simulations at z = 
is 10' - IO'Mq. Readers who wish to find a general overview of 
these simulations are recommended to consult section 2 of Abadi 
et al. (2006), while a more detailed description can be found in the 
references given above. 

There is various evidence that demonstrates the reliability of 
these simulations. Section 3.2 of Abadi et al. (2003a) discusses 
the photometric properties. Here they note the good agreement be- 
tween their surface brightness profile and a superposition of an 
i?'''* spheroid plus exponential disc. The issue of the outer parts of 
the halos of these simulated galaxies are discussed in Abadi et al. 
(2006). The mass distribution in these outer parts is seen to be con- 
sistent with the distributions of Galactic and M31 globular clusters. 
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Name Rescaling Myi, ('vi, iw 

Factor (IO'^Mq) (kpc) (kms"') 



KIAl 


1.30 


120.59 


230.73 


581.06 


KIA2 


0.88 


127.64 


235.15 


640.17 


KIA3 


0.87 


129.87 


236.50 


652.62 


KIA4 


1.15 


126.87 


234.67 


582.92 



Table 1. The properties of our simulated galaxies used in Section ITTI The 
second column shows the rescaling factor that has been applied, which 
is chosen so as to enforce the circular velocity at the virial radius to be 
140 km s"'. The second and third columns refer to the virial mass and ra- 
dius of the rescaled galaxies, respectively. The final column gives the escape 
velocity at the solar radius (8.5 kpc), as calculated according to the prescrip- 
tion described in Section lxT] 

In addition, the surface brightness in these outer regions is well fit 
by a Sersic law and is consistent with observations of both the ha- 
los of M31 and the Milky Way. Support that the kinematics of these 
halos are reliable can be drawn from the fact that they are consistent 
with the most recent observations of halo tracers (e.g. Battaglia et 
al. 2005, 2006). 

In order to allow for a fair comparison between these four 
galaxies and the Milky Way, we rescaled each of them so that their 
circular velocity at their virial radiuj] is equal to 140 km s"' . This 
value of 140 km s"' is a typical value one could expect from us- 
ing an NFW profile to extrapolate the circular velocity at the solar 
neighbourhood, ~ 220 km/s, to the virial radius. This results in a 
small rescaling factor (y) of between 0.9-1.3, where distances, ve- 
locities and masses are rescaled by 1 /y, I /y and I /y' , respectively. 
The properties of our rescaled galaxies are given in Table[T] 

For these simulated galaxies we need to estimate the escape 
velocity at the solar radius (which, for these simulations, we take 
to be 8.5 kpc from the centre of the model galaxy ). To do this we 
define the escape velocity as the velocity required to get to 3rvir, i.e. 
we define the zero of the gravitational potential to be at this radius. 
Although this choice of 3rvir is somewhat arbitrary, it is comparable 
to the separation of the Milky Way and its nearest massive neigh- 
bour M3 1 . We then determine the potential at the solar radius to de- 
duce the escape velocity. The resulting values for between 
581 and 653 kms"' (see Table[T). The stellar particles contribute a 
significant amount of mass, namely between 60 and 70 per cent of 
the total mass interior to the solar circle. 

Given these four simulated galaxies, the first question which 
needs to be addressed is whether the velocity distribution of the 
stars differs from that of the dark matter particles, and in particular 
whether the stellar velocity distribution extends all the way to I'esc- 
The stellar distribution must differ from that of the dark-matter par- 
ticles since the two populations have different density distributions. 

We investigate this issue by plotting the velocity distributions 
for the stellar and dark matter components in our four galaxies. 
Since the disc component of these simulations do not always pro- 
vide a wholly accurate representation of the Milky Way thin and 
thick discs, we only plot the distribution of stellar particles defined 
as belonging to the 'spheroid' population, according to the decom- 
position of Abadi et al. (2003b). This decomposition is based on 
the particles' orbital parameters and results in three populations: a 
halo with little net rotation, a thin disc with stars on nearly circu- 

' We define the virial radius as the radius at which the mean density is 200 
times the critical density for closure (pcr = 3H^/SnG » 7.9 X 10"^' kgm"' 
for// = 65kms-'Mpc"'). 
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Figure 1. This figure shows the velocity distribution of the stars (dashed) 
and dark matter particles (solid) from our four simulated galaxies. These 
distributions are for particles whose distance from the centre of the galaxy 
is between 3 and 14 kpc, and the stellar particles are those belonging to the 
'spheroid' population. 

lar orbits and a thick disc with properties intermediate between the 
other two components. The halo component comprises between 10 
and 40 per cent of the total number of stellar particles inside the 
virial radius. 

Our exclusion of the disc populations from these simulations 
can be further justified when one considers that we expect our high 
velocity sample of stars to contain only limited contamination from 
the Milky Way thick disc and negligible contamination from the 
thin disc (see Section [J!2] l. These velocity distributions are calcu- 
lated for particles whose distance from the centre of the galaxy is 
between 3 and 14 kpc. Since the escape velocity varies over this 
large range of radii, we rescale each velocity by the mean escape 
velocity at that radius. The resulting distributions are shown in Fig. 
[T] It can easily be seen from this figure that the velocity distribu- 
tions differ between the stellar and dark matter components, with 
the stellar particles' velocities concentrated towards smaller val- 
ues. This is due to the difference between the density profiles of 
the stellar halo and dark components, since the stellar particles are 
more concentrated and hence have smaller velocities. At the solar 
radius, the respective density distributions have power-law slopes 

of 2 for the dark matter and 3 for the stellar halo, which is 

consistent with what is found for our Galaxy (e.g. Morrison et al. 
2000). Despite the fact that the stellar velocities are smaller, it is 
encouraging to see that these velocities do indeed extend all almost 
the way to v^^c- This is vital if we are to use this statistical anal- 
ysis described in Section |2] to determine Vesc for the Milky Way. 
Although the decline in the velocity distribution for values close to 
I'esc appears to be sharper for the stellar particles, all of our galaxies 
have stellar particles with velocities greater than 0.9vesc. As a con- 
sequence, even if there is a truncation beyond 0.9vesc, the effect on 
Vesc cannot be more than 10 per cent. 

Since these simulations indicate that the velocity distribution 
of stars clearly differs from that of the dark matter particles, one 
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Figure 2. This figure shows the likelihood estimates for the exponent k, 
which denotes the shape of the velocity distribution. This is shown for both 
the stellar (thick) and dark matter (thin) high velocity 'solar neighbourhood' 
samples. The solid, dotted, short dashed and long dashed lines represent 
simulated galaxies KIAl, KIA2, KIA3 and KIA4, respectively. 



needs to ask how this affects the exponent k. To investigate this we 
evaluate the likehhood function (equation |9) for samples of stellar 
and dark particles, but keeping Vesc fixed at the true value calculated 
above, i.e.. 



P(kKi=i „) ^ 



p(mi,p(vjk) 

/ F{k')ni^P(vJk')dk'' 



(10) 



where we have assumed a uniform prior on k. 

To evaluate equation dlOt we need to construct a sample of 
particles equivalent to our observed solar neighbourhood stars. This 
is done by calculating the radial velocities of particles contained 
within a series (in azimuth around the galaxy) of non-overlapping 
spheres of radius 2 kpc located at a distance of 8.5 kpc from the 
Galactic centre. These spheres are all chosen to lie in the plane of 
the Galaxy. Since the presence of significant mean orbital rotation 
in our sample of stars will invalidate the assumptions made in Sec- 
tion|2] we only use the stellar particles defined as belonging to the 
non-rotating 'spheroid' population, according to the decomposition 
of Abadi et al. (2003b). In order to increase our statistics we also 
include spheres of radius 1 and 3 kpc located at radii 4.25 and 12.75 
kpc; we ensured that there is no trend between the exponent k and 
the radial location of the spheres within our range of values. To 
combine spheres located at different radii, the velocities of parti- 
cles within each sphere are rescaled by the escape velocity at the 
centre of the sphere. 

By evaluating equation l IlOl l we are able to deduce the like- 
lihood estimates for the value of k for the stellar and dark matter 
populations for each of our four simulated galaxies. The resulting 
likelihood distributions of k are shown in Fig. (2] 

As can be seen from this figure, the value of k significantly 
diff^ers for the two populations. For these simulated galaxies we can 
see that the dark matter particles appear to match the predictions 
from the literature, i.e. k e [0.5,2.5] (see above). However, the 



stellar sample does not appear to match this expectation; for these 
k is shifted towards significantly larger values. Since all attempts 
to constrain Vesc must be based on stellar samples, it means that a 
uniform prior onk e [0.5, 2.5] will not provide an accurate measure 
of the escape velocity. To estimate a better choice of prior from 
these stellar likelihood intervals, we take the same range of k but 
centre it on the mean likelihood of 3.7, i.e. k e [2.7,4.7]. 

At first sight it may appear from Fig. |2] that the constraints 
on k for galaxies KIAl and KIA2 differ slightly from the other 
two galaxies. However, this can be understood when one considers 
the fact that KIAl and KIA2 have fewer total numbers of stellar 
particles than KIA3 or KIA4. As a consequence the constraints on 
k appear to be shifted towards higher values for these galaxies. This 
is simply due to the fact that when one has fewer particles, although 
the regime — » is still well constrained (since negative k is very 
strongly disfavoured), it is harder to constrain large values of k. 
Given this fact, an alternative approach to determine a prior could 
be to take the range covered by the joint 90 per cent confidence 
interval from KIA3 and KIA4 alone. If we do this we obtain a range 
k e [2.3, 4.7], which is very similar to that found above. Given the 
fact that there is very little difference, we choose to adopt the former 
range (i.e. k e [2.7, 4.7]) for the subsequent analysis^ 

It is important to check whether the results presented in Fig. 
[2] are sensitive to our choice of rescaling, since there are a variety 
of different rescalings that one can adopt. To test this we adopt an 
alternative rescaling that enforces a circular velocity of 220 km s"' 
at 8.5 kpc. This results in a rescaling factor of 1.1 1, 0.78, 0.81, 0.83 
for galaxies KIA1-KIA4, respectively, which is not far removed 
from our original values (see Table[Tll. Reassuringly, when we re- 
peat the above analysis the resulting likelihood estimates for k are 
wholly consistent for each of our simulated galaxies. 



3.2 The choice of Vmin 

3.2.1 Estimating the level of contamination from the disc 

We wish to address the choice of v,^i„, i.e. the minimum velocity 
that is included in a sample of 'high velocity' stars. For their anal- 
ysis of radial velocities, LT90 chose a value of 250 km s"', while 
K96 chose 300 km s"', preferring to reduce any systematic errors 
at the expense of increased statistical errors. When considering the 
choice of v^i„ it is crucial to note that the analysis discussed in Sec- 
tion|2]relies on the assumption that the velocities are isotropic and 
there are no mean motions. This is especially important because 
RAVE fields are not isotropically distributed on the sky; if there is 
any net rotation then the averaging over the unknown tangential ve- 
locities performed in equation Jsj is in error. While we might expect 
the mean motion of the halo stars to be negligible compared to the 
relatively large statistical errors that will be present in any current 
analysis, the possible presence of the rotating thin- and thick-disc 
population in any sample cannot be ignored. 

In the absence of any additional information (such as metal- 
licities, etc), the best discriminator which can be applied to radial 
velocity samples is one based on the radial velocity itself. Given 
the expected small dispersion in velocities of disc stars, one would 
hope that a sufficiently large value of will be able to filter out 



^ If we repeat the subsequent analysis using this prior, we obtain the fol- 
lowing constraints on v'esc 494 km s"' < I'esc < 605 km s"' (90 per cent 
confidence), with a median likelihood of 540 km s"' . These constraints are 
practically indistinguishable from those found by adopting the prior using 
all four galaxies (see Section|5). 
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the disc population. We will investigate this issue by first estimat- 
ing the fraction of disc stars that we may expect in our high velocity 
sample. To do this we construct a simple toy model containing each 
of the three components, i.e. the thin and thick discs and halo. 

Before one can estimate the fraction of disc stars that may be 
present in our high velocity sample, one first must ask what the 
total fraction of disc stars will be over all velocities. From the lit- 
erature we find that in the solar neighbourhood the mass density 
of the three components can be estimated as 3.8 x 10"^, 1.3 x 10"^ 
and 1.5x 10"'' Mg pc"' for the thin disc, thick disc and halo, respec- 
tively (JahreiB & Wielen 1997; Fuchs & JahreiB 1998; Ojha 2001). 
However, our sample of stars will be drawn from a finite volume, 
and so we would expect to find significantly more halo stars than 
this value of 0.4 per cent. To obtain the final fraction we need to 
know how the mass density of the components varies spatially and 
also the approximate boundary of our sample volume. We take a 
value of 2 kpc for this latter quantity and estimate the former by 
assuming that the mass density has negligible radial dependence in 
the plane and is solely determined by the vertical scale-height ( for 
which we adopt the values of 260 and 860 pc for the thin and thick 
discs, respectively, consistent with the local normalisations; Ojha 
2001). We assume that the halo density is constant. These assump- 
tion should be valid provided our RAVE sample volume is not too 
large. The final property that we include is that our RAVE sample 
excludes stars within 25 degrees of the plane. If we fold all this 
information into a toy model, then the resulting value for the frac- 
tional contribution of the three components before considerations 
of kinematics becomes 73.5, 19.5, and 7.0 per cent for the thin disc, 
thick disc and halo, respectively. Clearly we cannot calculate these 
fractions with any high degree of certainty since we have not ac- 
counted for the luminosity function of the three components and 
also the local mass densities are subject to much discussion. How- 
ever, we believe that these values provide a reasonable estimate and 
should be suitable for our purposes. 

The final ingredient necessary for calculating the expected 
level of contamination from disc stars is to include the kinematics 
of the three components. This is done by simply assuming Gaussian 
velocity distributions with mean orbital rotation and dispersions as 
given in Table|2] The halo parameters are taken from Chiba & Beers 
(2000). Although their value of (v^ = 40km s"' is in conflict with 
our assertion that the halo must have no net rotation, this value is 
clearly small compared to the dispersions. The thin- and thick-disc 
dispersions are deduced from the data of Nordstrom et al. (2004) 
following the prescription of Binney & Merrifield (1998, section 
10.4; see also Reddy, Lambert & Allende Prieto 2006); namely that 
thin disc stars satisfy age<10 Gyr, [Fe/H] > -0.4, while thick-disc 
stars satisfy age>10 Gyr, [Fe/H] < -0.4. 

Given this toy model we can now estimate the contribution of 
each component in a sample of stars with radial velocity greater 
than some cut-off Vn,in. To do this we take each observed star in 
the RAVE catalogue and generate 100 mock stars along this line- 
of-sight according to our toy model. We can then calculate the 
fractional contribution of each component as a function of i'„,in> 
which is shown in Fig. [3] From this figure it can be seen that 
for I'min > 250 km s"' the contamination from the thin disc is 
negligible. However, thick-disc stars will be present for veloci- 
ties as high as 400 km s"'. As stated above, K96 chose a value of 
Vmin = 300 km s"', for which our toy model predicts a contamina- 
tion of approximately 10 per cent. The contamination rises to ~ 20 
per cent if vviin = 280 km s"' and ~ 40 per cent if = 250 km s"' . 
Although the optimal value of v,^i„ is dependent on the line-of-sight 
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Table 2. Velocity dispersions of the three components in the toy model of 
Section lX2l The halo dispersions are taken from Chiba & Beers (2000). The 
thin- and thick-disc dispersions are deduced from the data of Nordstrom et 
al. (2004) following the prescription of Binney & Merrifield (1998; section 
10.4); namely that thin disc stars satisfy age<10 Gyr, [Fe/H] > -0.4, while 
thick-disc stars satisfy age>10 Gyr, [Fe/H] < -0.4. We adopt a value of 
220 km s" ' for the local standard of rest and take the solar motion from 
Dehnen & Binney (1998). 
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Rfidial velocity cut (v^^^^) 

Figure 3. This figure shows the fractional contribution of each Galactic 
component as a function of Vmin, i.e. the minimuin radial velocity of the 
sample. These results come from the toy model described in Section |X21 

(for example as a function of longitude), we follow previous studies 
and do not adopt a spatially varying Vmin- 

3.2.2 Quantifying the effect of disc contamination 

We now need to test whether this level of contamination will af- 
fect our results. To do this we return to the simulations of Abadi 
et al. (2006) that were introduced in Section [3T| Since the disc 
components in these galaxies do not always provide a wholly accu- 
rate representation of the Milky Way disc, we instead include the 
contamination from the thick disc according to the toy model de- 
scribed above. We investigate two fiducial cases, namely thick-disc 
contamination of 10 per cent and 20 per cent (note that for velocity 
cuts of v,ni„ > 250km s"' we predict that the contamination from 
the thin disc will be negligible). For the halo component we take the 
'spheroid' star particles as defined in Section [STI In addition, we 
restrict ourselves to the analysis of only two of our simulated galax- 
ies (KIA3 and KIA4) since they have significantly more particles 
than the other two. In order to make a fair comparison between the 
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two galaxies we rescale the velocities of the halo component so that 
I'esc = 600 km s ' and ensure that each sample contains 100 stars. 
For the minimum velocity cut we take a value of I'^in = 300 km s"' . 

Given these two samples of simulated stars with 10 and 20 
per cent thick-disc contamination, we evaluate equation l|9ll- The 
results from this analysis indicate that the presence of a thick-disc 
population has a noticeable, but not hugely significant, effect. The 
peak of the likelihood distribution is shifted towards larger k and 
I'esc, although for a given value of k the predicted value of Vesc is 
lower. Therefore, when one applies the prior calculated in Section 
13.11 (i.e. k e [2.7,4.7]) the predicted escape velocity is reduced. 
For a thick-disc contamination of 20 per cent, this shift is approx- 
imately -30km s"' for both KIA3 and KIA4; for a contamination 
of 10 per cent this shift is reduced to -15kms"' for KIA3 and 
-10km S-' forKIA4. 

In conclusion, we concur with K96 and propose that a value of 
I'rain = 300 km s"' is a safe choice which should allow us to obtain 
a reliable and robust determination of the escape velocity. 

4 DATA 

The following sections describe the data that we will use to con- 
strain the escape velocity. 

4.1 The RAVE project 

The Radial Velocity Experiment (RAVE) is an ambitious survey to 
measure radial velocities and stellar atmosphere parameters (tem- 
perature, metallicity, surface gravity) of up to one million stars us- 
ing the 6dF multi-object spectrograph on the 1.2-m UK Schmidt 
Telescope of the Anglo-Australian Observatory (AAO). The RAVE 
survey is ideal for constraining Vesc because it is a magnitude lim- 
ited survey (9 < / < 12), which means that it avoids any kinemati- 
cal biases, unlike catalogues constructed using high proper-motion 
stars. Given the RAVE magnitude range, the catalogue will consist 
of giants up to a distance of a few kpc and local dwarfs. The project 
is described in detail in the paper accompanying the first data re- 
lease (Steinmetz et al. 2006). It is foreseen that the RAVE project 
will run until 2010. This survey will represent a giant leap forward 
in our understanding of our own Milky Way galaxy, providing a 
stellar kinematic database larger than any other survey proposed 
for this coming decade. For our analysis we use an internal data 
release (Summer 2006) containing radial velocities for over 50,000 
stars covering an area of nearly 8000 square degrees. 

4.2 Our high radial velocity RAVE sample 

To construct a catalogue of high radial velocity objects we must 
first convert our heliocentric velocities into Galactic standard of rest 
frame velocities (also known as Galactocentric radial velocities; see 
for example equation (10-8) of Binney & Tremaine 1987). This is 
done by taking the Local Standard of Rest to be 220 km s"' and 
the solar motion to be (10.00, 5.25, 7.17) km s"' (Dehnen & Bin- 
ney 1998). Note that unless stated otherwise, all velocities will be 
quoted in the Galactic standard of rest frame. We then take all stars 
greater than Vn,i„ = 300 km s"' . 

It is vital to ensure that our sample is free from contamina- 
tion from spurious measurements. To this end we enforce a high 
threshold for the value of the Tonry-Davis (Tonry & Davis 1976) 
correlation coefficient between the observed spectrum and the tem- 
plate spectrum (R > 10). We perform additional checks to verify 



the reliability of the data in the following section. It should also 
be noted that currently the metallicities and gravities have not been 
accurately determined for the RAVE data, so we will only use the 
radial velocities in our analysis of the escape velocity. 

A total of 16 stars pass these criteria. Two of these have 10 < 
R < 15, but the velocities for both stars have been confirmed by 
follow-up observations (see the following Section). 

4.3 Data validation 

This subsection discusses the methods that have been adopted to 
verify the reliability of the RAVE radial velocity measurements of 
our sample of 16 high velocity stars. For a comprehensive discus- 
sion of the quality of the RAVE data in general, we refer readers to 
the data-release paper (Steinmetz et al. 2006). 

We have obtained follow-up observations of a selection of 
RAVE stars using the 2.3m Advanced Technology Telescope at 
Siding Springs Observatory (Steinmetz et al. 2006). The data were 
taken using the Double Beam Spectrograph, giving slit spectra be- 
tween 8000 - 8900 A at a resolution similar to the RAVE ob- 
servations (0.55 A/pix). In general, the radial velocity measure- 
ments show reasonable agreement with the RAVE values to within 
~ 3kms"'. Given the fact that we wish to use only the highest 
radial velocity stars for our Vesc analysis, this level of discrepancy 
is negligible. Among this selection of re-observed stars there were 
12 of our high radial velocity RAVE stars, all of which show good 
agreement. A smaller subsample of RAVE stars were also observed 
with the echelle spectrograph on the 3.5m Telescope at Apache 
Point Observatory (APO), New Mexico. The single-slit echelle 
has a spectral resolution of 37,000 and covers the entire optical 
wavelength range (3500A-l/j;;i). Nine of the observed stars were 
high-velocity stars. Similarly to the 2.3m data, the uncertainties 
are dominated by RAVE errors, and the radial velocity measure- 
ments agreed with the RAVE velocities within 3cr. In Fig. |4] we 
show the results of the follow-up observations for these 1 1 stars. 
All show good agreement, with none having a discrepancy of more 
than 7 km s"' . 

Although we found good agreement for our follow-up veloc- 
ities, there were two exceptions not mentioned above. Two of our 
stars showed behaviour consistent with being binary stars. One of 
these stars (C2 129509-0804 18) has 4 observations taken over a pe- 
riod of 18 months, while another (C1437570-280154) has 2 obser- 
vations taken over a period of 26 months. From our measured ve- 
locities we can place a lower limit on the semi-amplitude of the ve- 
locity variations and we find values of 10.6 km s"' and 15.4 km s"' 
for C 1437570-280 154 and C2129509-080418, respectively. These 
values are typical of single-lined spectroscopic binaries in Latham 
et al. (2002), for which 68 per cent have semi-amplitudes within 
the range 3 kms"' to 15 kms"'. Therefore if we assume that the 
actual semi-amplitude for our two RAVE stars is close to our ob- 
served lower limit, we can estimate the centre of mass velocity by 
averaging the highest and lowest velocities for each star. Given the 
statistics of the Latham et al. (2002) sample, in particular that less 
than 10 per cent of their binaries have semi-amplitude greater than 
20kms"', we can predict that the probable error for the estimated 
velocity of our RAVE stars is less than 10 kms"'. Both of these 
centre of mass velocities are above Vn,in and hence we retain these 
for our analysis. Our fraction of binary stars (2 out of 16) is com- 
parable to the fraction of single-lined spectroscopic binaries found 
in the Latham et al. (2002) sample (171 out of 1359). 

In summary, out of our final sample of 16 RAVE stars with 
Galactocentric velocity greater than 300 km s"' , 14 have repeat ob- 
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Figure 4. Results from the follow-up work described in Section l43] for 12 
of our high radial velocity RAVE stars. The horizontal axis shows the dif- 
ference between the velocity as reported in the RAVE catalogue compared 
to the weighted mean of all velocities taken during the follow-up campaign. 
Note the good agreement between the two measurements. Typical errors 
are ~ 2km s"' for the RAVE catalogue and < 1 kms"' for the follow-up 
velocities. 



Figure 5. The relation between radial velocity (corrected for Solar mo- 
tion) and longitude for stars in the RAVE catalogue. Note that the sig- 
nature of the disc is clearly visible. The horizontal lines correspond to 
IV = -300, -250, 0, +250, +300 km s"' . The crosses simply denote stars 
with > 250 km s"'. 



servations (including the two binary stars). The total number of ob- 
servations for each star varies from between one and four (see Table 



4.4 The final high-velocity RAVE sample 

Given this high quality RAVE data, we are now able to construct a 
final catalogue of high radial velocity stars. Since many of our stars 
now have repeat observations, we choose to adopt the weighted 
mean of all measurements as our definitive velocity, with the ex- 
ception of the two binary stars for which we give our estimate 
of the center of binary mass motion. These are tabulated in Ta- 
ble [3] and the velocity distribution is shown in the inset of Fig. 
|6] In Fig. [5] we show how the radial velocities vary as a function 
of Galactic longitude. This plot clearly shows the signature of the 
Galactic disc and from this one can obtain an understanding of 
why a value of v,nin ~ 250 km s"' results in significant contami- 
nation from the disc; if the mean rotational velocity of our sample 
is close to zero (as we would like for a halo population), then there 
should be an equal number of stars with positive and negative ra- 
dial velocity for a given longitude. However, for I x 270 there is 
clearly a greater number of stars with radial velocities in the range 
Vj 6 (-300,-250) compared to i', 6 (250, 300), indicating contam- 
ination by a rotating component. Note that this asymmetry is not 
evident for stars with |vv| > 300 km s"', which supports our choice 
of v'lnin = 300 km s"'. 



4.5 Augmenting our high velocity sample with stars from 
archival surveys 

Since we would like our sample of stars to be as large as possible, 
we incorporate additional stars from the Beers et al. (2000) cata- 
logue of metal poor stars. It is important to note that this sample 
is kinematically unbiased, which is important if we are to com- 
bine datasets in this way. This sample is ideal since it contains 
metal poor stars, which are preferentially halo stars. The Beers et 
al. (2000) sample provides a total of 17 stars faster than 300 km s"' , 
once we have removed three stars for which the distance estimate 
indicates that they are further than 5 kpc away (all of the retained 
stars have distances of less than 2.5 kpc). These archival stars are 
given in Table IB 1 1 

This brings the total number of stars in our full augmented 
sample to 33, which is a significant improvement on the number 
of stars used in LT90 (15 stars with v, > 250 km s"') and K96 (10 
stars with Vy > 300 km s"'). 

The velocity distribution for this larger augmented sample is 
shown in Fig.|6] Note that the inset of this figure compares the dis- 
tribution of RAVE stars with the distribution of our archival stars 
from Beers et al. (2000). The Kolmogorov-Smimov test indicates 
no significant discrepancy between these two distributions (14 per 
cent probability that they come from different distributions), so 
there is no inconsistency in combining the two data sets. In ad- 
dition, similar to the RAVE sample (as was shown in Fig. |5](, we 
reassuringly find no correlation between radial velocity and Galac- 
tic longitude. In Section|5]we check that the process of combining 
datasets does not introduce any obvious bias by carrying out the 
likelihood analysis on both the combined sample as well as a sam- 
ple consisting solely of our 16 RAVE stars. 
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Table 3. The high velocity sample of RAVE stars. See Section|4]for details about these stars. The identifiers are taken from the RAVE input catalogue, while 
the /-band magnitudes are from the DENIS catalogue (Epchtein et al. 1997). Note that the magnitudes of three stars, C1437570-280154, C2129509-080418 
and T7524.00065.1, are not available in the DENIS catalogue and so the values for these stars are taken from USNO-B (Monet et al. 2003). Column one 
shows the Galactocentric radial velocity and column two shows the error in the velocity; where we have more than one observation for a star we adopt the 
weighted mean. Two stars in our sample are believed to be binaries (C1437570-280154 and C2129509-080418) and for these we give an estimate of the center 
of binary mass motion (see Section l43t . 




Figure 6. The cumulative distribution for the magnitude of the radial veloc- 
ities. The dashed line shows the velocities for our sample of high radial ve- 
locity stars, which has been constructed by augmenting a sample of RAVE 
stars with a supplementary archival survey (Beers et al. (2000); see Sec- 
tion|4). The solid line coiTesponds to the maximum likelihood evaluated in 
Section[5](i'esc = 616kms"', k = 7.1). The inset shows the comparison be- 
tween the 16 RAVE stars (solid) and the 17 archival stars (dotted), showing 
no clear discrepancy between these two samples. 

5 RESULTS 

We now wish to use the sample of high velocity stars described 
above to constrain the local escape velocity. To do this we apply 
the techniques outlined in Section|2l 



5.1 Maximum liltelihood analysis of the sample 

Evaluating equation ^ results in 2-dimensional likelihood con- 
tours for k and v^^^, shown in the lower panel of Fig.|7] The peak of 
the likelihood lies at Vesc = 616km s"' and k = 7.1. Although this 
value of k is slightly greater than that predicted from our cosmolog- 
ical simulations in Section [37T1 we find that our peak likelihood is 
relatively broad. In addition, as was discussed in Section [3.2.2l al- 
though our predicted level of thick-disc contamination should have 
little effect on our v^^^ constraints, any contamination will shift the 
peak likelihood to higher values of k. In Fig. [6] we show the prob- 
ability distribution for the radial velocities (given by equation O 
corresponding to this peak likelihood. 

As can be seen from Fig. [7] given the size of our sample of 
high radial velocity stars it is not possible to constrain simultane- 
ously both Vesc and k using this method. 

From this figure we can see that there is a degeneracy between 
k and Vesc- This can be understood as follows: to retain the same 
shape of the velocity distribution over the observed range of veloc- 
ities, an increase in v'esc must be accompanied by an increase in the 
steepness of the slope of the distribution (i.e. k). Given this degen- 
eracy, we need to apply a suitable prior on k to obtain constraints 
on Vesc. The results from Section [37T] indicate that a suitable choice 
is to adopt a uniform prior in the range k e [2.7,4.7]. We have 
done this, and the results are shown in the top panel of Fig.[71 The 
resulting 90 per cent confidence interval is, 

498kms"' < Ve^c < 608kms"', (11) 

with a median likelihood of Vesc = 544 km s"'. Note that despite 
the degeneracy between k and Vesc, it is immediately evident that 
negative values of k are strongly disfavoured. 

It is important to check whether any kinematical bias is intro- 
duced by combining data from separate surveys. To do this we re- 
peat the above analysis using only the 16 RAVE stars from Section 
14.41 This is shown by the dotted quantities in Figure [T] Reassur- 
ingly there is no noticeable offset between the two sets of contours; 
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Figure 7. The lower panel shows the 2-dimension likelihood contours that can be placed on the local escape velocity (I'esc) and the shape of the velocity 
distribution (k\ see Section |2) from our combined high-velocity sample. The cross corresponds to the peak likelihood, while the contours denote 10 and 1 
per cent of this peak likelihood value. The upper panel shows the likelihood distribution for I'esc obtained by assuming a uniform prior on A: 6 [2.7,4.7] (see 
Section IxTt : the con'esponding error bar shows the 90 per cent confidence interval. The dotted quantities show the results from a sample containing only the 
high- velocity RAVE stars, i.e. a smaller sample of 16 stars. 



the only difference is a general broadening of the contours. When 
we apply the prior k e [2.7, 4.7] we find that the 90 per cent con- 
fidence interval becomes 496 < Vcsc < 655 km s"', with a median 
likelihood of v^^^ = 556 km s"' . 



5.2 Bootstrap analysis 

To further assess the likelihood constraints presented in the previ- 
ous section, we also apply the bootstrap technique (see Section [Z2t 
to our data. 

We apply the bootstrap approach to the combined dataset of 
33 stars, but unlike Section ISTI we apply the LT90 approximation 
(equation |6]l when calculating the maximum likelihood. The boot- 
strap computed the values of v^sc and k that maximized equation 
(|9j using 5000 resamples of the original RAVE sample. Table |4] 
shows the resulting values of i'esc and k for the two chosen priors 
(see Appendix [At. 

When we compare the bootstrap interval with the likelihood 
interval obtained in Section ISTI we find that the interval is shifted 
towards smaller Vesc. This is consistent with what one would expect 



for the bootstrap method, since (unlike the method described in 
Section lSTt the process of bootstrapping can result in values of v^^ 
that are smaller than the highest velocity star in the sample. This 
is a consequence of the fact that the bootstrap approach accounts 
for possible unreliable or inconsistent data. However, it is also im- 
portant to note that the bootstrap mean values of k and Vesc found 
with both priors are identical, within standard errors, to those found 
in the previous section using the non-bootstrap technique with the 
LT90 prior. Figure [8] shows the bootstrap distributions and corre- 
sponding confidence intervals calculated for k and v^^c when each 
prior is applied to equation The dashed curves correspond to 
Prior 1, while the solid curves correspond to Prior 2. The confi- 
dence intervals obtained using Prior 2 are clearly smaller than that 
from Prior 1, owing to the fact that Prior 2 contains more informa- 
tion about our expectations of k. 

As a result of our analyses with a simulated dataset (see Ap- 
pendix|Aj, we adopted the confidence regions from Prior 2, obtain- 
ing the bootstrap 90 per cent confidence intervals 462 kms"' < 
Vesc < 640 kms"' and 0.1 <k < 5.4. 
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Label Prior Form Initial MLE Bootstrap SE 90%-Conf. 

Values Mean Values 

1 PLr(Vesc, 1/vesc = 588.7 km s"' 569.9 98.3 (445.0 769.4) 

it = 4.2 4.1 3.0 (0.3,9.0) 

2 Pj{v^,c,k)oc j-—^&^-j= vesc = 535.2 km S-' 521.5 52.5 (462.0,640.0) 

k = 2.9 2.7 1.6 (0.1,5.4) 



Table 4. Bootstrap results for combined data with i'„,in = 300 km s" . The third column gives the maximum likelihood values of x'esc and k for the original 
(non-bootstrapped) sample, while the fourth column gives the mean of the bootstrap distribution of maximum likelihood values (Note that the means quoted 
here for Prior 1 differ from those given for the maximum likelihood method in Section lsTl since the former employs the LT90 approximation, i.e. equation|6). 
The SE column gives the standard errors from the standard deviation of the bootstrap distribution. The last column gives the 90 per cent confidence intervals 
for I'esc and k. The I'esc confidence intervals are computed using the bootstrap normal-distribution approximation, while the k intervals are calculated from a 
chi-squared disfiibution with the number of degrees of freedom equal to the bootstrap mean value. 
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Figure 8. The bootstrap distributions for k (top panel) and I'esc (lower 
panel). The dashed and solid curves coirespond to the bootstrap distiibu- 
tions calculated when solving equation (9) with Priors 1 and 2, respectively. 
The corresponding error bars show the 90 per cent confidence intervals for 
the parameters for each prior. Note that for both parameters. Prior 2 gives 
the smallest confidence interval. Furthermore, note that the bootstrap distri- 
bution for k follows a chi-squared distribution. 

6 DISCUSSION 

6.1 The nature of the fastest RAVE stars 

One might wonder about the nature of the fastest stars in the RAVE 
catalogue. Although we do not currently possess any additional in- 
formation from the RAVE spectra, such as metallicities, we do have 
estimates for each star's proper motion from various sources (Stein- 



metz et al. 2006). When combined with accurate photometry (in 
our case, J- and A'-band magnitudes from 2MASS [Skrutskie et 
al. 1997]), proper motions can be used to place the stars on a re- 
duced proper motion (RPM) diagram. A RPM diagram can be used 
to differentiate types of stars such as dwarfs and giants of different 
stellar populations because it is closely related to a standard colour- 
magnitude diagram, modified by kinematics. The y-band RPM is 
given by 

^RPM = J + 51og^ + 5 = Mj+ Slog ''7r""°' 1 + 5. (12) 

47.4 km s ' 

where ji is the proper motion in arcsec yr"' and v,angentiai is the tan- 
gential velocity corresponding to this proper motion. Thus one ob- 
tains a quantity that is related to the absolute magnitude, but with 
stellar populations of different mean motions offset from one an- 
other, with scatter induced by the velocity dispersion. 

We show the RPM for our fastest RAVE stars in Fig |9] 
Three Bonatto et al. (2004) isochrones, adjusted for different mean 
Viangentiai, slk also plotted. The three isochrones have ages and 
metallicities corresponding to different assumptions regarding the 
kinematics: namely thin disk, thick-disk and halo populations. The 
thin solid isochrone has been plotted assuming thin disk kinemat- 
ics, with tangential velocity v,„„g„„j„i = 20kms"', Z = 0.019, and 
age= 2.5 Gyr. The dashed isochrone represents the thick-disk com- 
ponent, Viangentiai = 42kms"', Z = 0.004, age= 10 Gyr. Finally, the 
thick solid isochrone depicts the halo with i'M,ij;mna; = 200 km s"', 
Z = 0.001, and age= 10 Gyr. In a probabilistic assignment of any 
one star to a given population, one would take account of the ex- 
pected cr ~ 150 km s"' spread around the halo isochrone. 

The RAVE high-velocity stars clearly scatter around the red 
giant branch of the halo isochrone; this association is even stronger 
for the highest- velocity stars, those with i', > 300 km s"', denoted 
by the triangle symbols. This is reassuring because it indicates that 
our sample should not suffer from significant contamination from 
the thick disc, thus justifying our choice of v^in in Section [X2l Fur- 
thermore, using the echelle data from APO (section |43t . we were 
able to calculate gravities and temperatures for a few of the high- 
velocity stars. These gravities were low, confirming giants. The 
bluest star appears from its spectrum to be a halo blue horizontal 
branch (BHB) star|] 

If these stars are indeed mostly halo giants, then they are not in 
the immediate solar neighbourhood. This could have implications 

3 The BHB star has identifier, C 11 00242-024226. 
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Figure 9. J-Band reduced proper motion diagram of RAVE stars selected 
to liave Galactocentric radial velocity greater than 250kms"'. The trian- 
gle symbols represent stars with Galactocentric radial velocity greater than 
300 km s" ' . Note that the errors in the photometry are smaller than the sym- 
bols. Isochrones from Bonatto et al. (2004) are overplotted, with different 
tangential velocities adopted, to represent the thi'ee main stellar components 
of the galaxy. The thin solid Hne is appropriate for the thin disk (Z = 0.019, 
age= 2.5 Gyr, v,angeniial = 20kms"'), the dashed Hne describes the thick 
disk (Z = 0.004, age= 10 Gyr, v,angeniial = 42kms"'), and the thick solid 
line desciibes the halo (Z = 0.001, age= 10 Gyr, i'ro«g™nn/ = 200 kms"'). 
The majority of the stars have positions in this diagram consistent with be- 
ing halo giants. However, the star with J - K corresponding to ~ 0.15 is 
most likely a BHB star. 

for our determination of t'csc and so we investigate this issue in the 
following section. 

It is also worth noting that our sample contains three rela- 
tively high proper-motion stars with /j > 50 mas yr"' (C1022369- 
140345, C1250398-030748 and T8395_01513.1). These all appear 
to be halo dwarfs or relatively nearby halo giants. By calculating 
distances from either our follow-up spectra or from the crude dis- 
tance calibration of Scholz, Meusinger & JahreiB (2005) we find 
that their total 3D velocities are all well within our value of v^^c ■ 

6.2 The efTect of sample volume on the escape velocity 

In the Section lSTI we used a sample of stellar particles from cosmo- 
logical simulations to analyse our ability to recover the escape ve- 
locity. In that section our solar neighbourhood volumes were con- 
structed from spheres of radius 2 kpc centred at 8.5 kpc (and also 
equivalent spheres of radius 1 kpc and 3 kpc centred at 4.25 kpc 
and 12.75 kpc, respectively, thus preserving the ratio of sample ra- 
dius to distance from the Galactic centre). However, the catalogue 
of radial velocity measurements used to try to constrain the Milky 
Way escape velocity could contain stars at significantly larger dis- 
tances. The magnitude limit of the RAVE survey is / = 12 mag, 
which implies that very bright giants may be a significant distance 
away; for example, a K2 giant with M/ a -0.9 mag could be at a 
distance of around 4 kpc. Indeed, Section [6ni suggested that some 
of these fastest RAVE stars may well be halo giants. 

We test whether our results are sensitive to the sample volume 
by analysing the KIA3 and KIA4 simulations introduced in Sec- 
tion |3.1| (we restrict ourselves to these two galaxies as they have the 
largest number of stellar particles). We generate samples of stellar 
particles by taking spheres of varying radius centred on 8.5 kpc. 



From these samples we can estimate the escape velocity by eval- 
uating equation ([9]l, adopting the prior determined in Section [3711 
(i.e. k e [2.7,4.7]). This analysis shows how the recovered escape 
velocity varies as a function of the radius of the 'solar neighbour- 
hood' spheres. We find that for larger spheres the escape velocity 
is slightly over-estimated: the error in Vesc is < 15 per cent for all 
radii up to 5 kpc, with an error of < 10 per cent for radii of less than 
4 kpc. This over-estimation is probably due to the fact that larger 
volumes include stars that probe regions of higher Vesc and hence 
result in larger estimates for Vesc- It should also be noted that in re- 
ality this is an upper-limit since only a small proportion of our sam- 
ple of observed stars will probe such large distances, whereas these 
simulated samples do not incorporate any such distance effects (i.e. 
the probability of selecting a simulated star is independent of its 
distance). 



6.3 Unbound stars 

One obvious deficiency of the analysis presented in Section |2] is 
that it requires all stars in our sample to be bound to the Galaxy. It 
is clear that the presence of any unbound stars would invalidate our 
results, although the bootstrap technique should reduce our sensi- 
tivity to any spurious datapoints. There are currently seven known 
'hyper- velocity', probably unbound, stars in our Galaxy (Edelmann 
et al. 2005; Hirsch et al. 2005; Brown et al. 2006). The Galactic 
standard of rest frame radial velocities computed for these stars 
range from ~ 550 km s"' to ~ 720 km s"' . The first few were found 
serendipitously while the most recent were found by a targeted sur- 
vey of faint early-type stars (Brown et al. 2006); indeed all of the 
known hyper-velocity stars are blue and therefore almost certainly 
young (Kollmeier & Gould 2007). 

It is suggested that such high velocities may originate from 
encounters with the massive black hole at the Galactic centre. How- 
ever, Yu & Tremaine (2003) found that the numbers of hyper- 
velocity stars produced would be about 10^'yr"' (see also Perets, 
Hopman & Alexander 2006). That would give about 10' stars in 
total, assuming a long-lived black hole, which suggests that hyper- 
velocity stars are very rare. Furthermore, they found that the num- 
ber of lower velocity 'high-velocity' stars expected is even smaller. 
Although such objects should be rare, and hence the chance of one 
appearing in our catalogue is small, this can nevertheless not be 
ruled out entirely. There is one simple test one can apply to inves- 
tigate this issue; if one can obtain the full 3 dimensional velocities 
for the fastest stars, then it can be ascertained whether their direc- 
tion of motion is consistent with that of an object ejected from the 
centre of the Galaxy. However, this method could fail if the un- 
bound star(s) originate from different circumstances, such as stars 
which are bound to the local group but unbound with respect to our 
Galaxy. 

High-velocity stars can also originate from stellar binary sys- 
tems in which one of the stars undergoes a supernova and 'kicks' 
out the companion star. This would have larger effects on main se- 
quence stars as they can acquire an extra velocity of ~ 100 km s"' . 
However, since our sample contains mostly halo giants (see Section 
16. It this should be a small effect. 

One final point to note is that given the consistency between 
our constraints on Ves^ with other independent methods for estimat- 
ing the mass of the Galaxy, the presence of any unbound stars in 
our sample seems to be an unlikely scenario. 
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6.4 The mass and extent of the Galactic halo 

The escape velocity can be used as a powerful tool to probe the 
mass distribution in the Galaxy. Our constraints on v^sc allow us to 
constrain the halo potential at the Solar radius, 



«,o.al('-o) = -KJ2. 



(13) 



Halo masses derived from our constraints on Vesc rely on the as- 
sumption that our fastest stars probe velocities all the way to the lo- 
cal escape velocity, i.e. there is no truncation in the stellar velocity 
distribution. If this assumption is invalid then the mass constraints 
quoted in this section would be lower-limits to the true mass. How- 
ever, as has been shown for the simulated galaxies in Section [3T| 
the level of truncation in the stellar component cannot be more than 
10 per cent. When the likelihood analysis (equation|9]l is applied to 
these simulations we do not find any systematic underestimation in 
the recovered Vesc, which gives us confidence that such weak levels 
of truncation (if present) will not pose significant problems for our 
analysis. 

extend up to When we determine the escape velocity for the s 
the maximum likelihood estimator for 

In this section we will use v^^c to constrain the total mass of the 
halo. However, before undertaking any detailed calculations, one 
obtain a qualitative understanding of the situation from the simple 
relation (see, for example, equation 2-22 of Binney & Tremaine 
[1987]), 



p(r)rdr. 



(14) 



where v^-^^ is the circular velocity at the Solar radius. The fact 
that our measured lower-limit of Vesc > 498 kms"' is signifi- 
cantly greater than V2vcirc ~ 31 1 km s"' shows that the second 
term in equation l ll4t cannot be small, i.e. there must be a sig- 
nificant amount of mass exterior to the Solar circle. This simple 
argument convincingly demonstrates the presence of a dark halo in 
the Galaxy. 

Another straightforward calculation that can be made is the 
extent of an isothermal halo that has a constant circular velocity 
out to a truncation radius r^ut. For this elementary model we find 
that, 



'b exp 



2vl. 



- 1 



(15) 



Our measured value of v^x (I'csc ~ 544 km s" ) corresponds to a 
truncation radius of r^^f x 58 kpc. 

The above simple arguments demonstrate the existence of a 
dark halo, but we now undertake a more detailed calculation in or- 
der to constrain the total mass of the halo. To do this we first need 
a model for the baryonic contribution to the total potential from 
the bulge and disc of the Galaxy. We adopt a spherical Hernquist 
(1990) bulge with mass 1.5x IO'^Mq and scale radius 0.6 kpc, and a 
Miyamoto & Nagai (1975) disc with mass 5 x 10"'Mo, scalelength 
4 kpc and scaleheight 0.3 kpc. The halo potential is then simply. 



bulge 



(16) 



When Ohaio is combined with a halo profile it is possible to 
probe the total halo mass. One popular profile is the NFW halo 
(Navarro, Frenk & White 1996). It can be shown that the radial 
potential for an NFW density profile can be expressed as, 



*NFW('') 



ln(l + — ), 



(17) 



where c is a concentration parameter (c) equal to the ratio of the 
virial radius to the scale radius, and Ps is a characteristic density 
given by. 



ln(l + c) - c7(l + c) 



(18) 



where p^r = 3H^/S7tG is the critical density of the universe, Q.o is 
the contribution of matter to the critical density and (5,1, is the critical 
overdensity at virilisation. The virial mass can then be determined 
from the virial radius. 



47T 

M^ir = YPcno^th^vir- 



(19) 



If we take the solar radius to be 7.5 kpc, = 0-3, (5tii = 340, 
H = 65 km s"' Mpc"', and enforce the circular velocity at the solar 
radius to be Vcin.- = 220 km s"', our 90 per cent confidence con- 
straint on the escape velocity (equation II It allow us to obtain the 
following constraints on the virial radius, mass and concentration 
parameter for the Milky Way, 



Mvi,- 

r™ = 257!*^kpc. 



-'-0.29 ■ 
7+47 1 



: 24.3 



+6.5 
-5.1- 



It is worth noting that these constraints have little dependence on 
the adopted value for the solar radius; taking a value of 8 kpc will 
change the limits on Myi, and ryi,- by less than 1 per cent. 

One common variation of this classical NFW model is to ac- 
count for the adiabatic contraction of the dark matter halo due to the 
presence of the baryons in the disc and bulge (Mo, Mao & White 
1998). In this instance we simplify the calculation by adopting an 
exponential disc, retaining the same mass and scalelength as above. 
The resulting constraints on the virial mass, radius and concentra- 
tion are, 

M™- = 1.42!i;^* X lO'^Mo, 
'■vir = 305!* kpc. 



Both of these models predict low values for the virial velocity 
of the halo. The contracted NFW model has I'vi, = I42;^2[ kms"', 
while the uncontracted one has Vyir = 124!j|j km s"', i.e. around 
half I'circ- These low values of Vvi, pose problems for semi-analytic 
models of galaxy formation that require the peak circular velocity 
of the disc to be ~ Vyir in order to simultaneously explain the nor- 
malisation of the Tully-Fisher relation and the galaxy luminosity 
function in a ACDM universe (e.g. Cole et al. 2000). More recent 
semi-analytic models (Croton et al. 2006a, 2006b) have been able 
to relax this requirement slightly, so that the peak circular velocity 
of the disc is only required to be comparable to the maximum cir- 
cular velocity of the dark matter halo. However, it can be seen that 
our models are only marginally consistent with this requirement, 
e.g. for our contracted model, the peak circular velocity of the halo 
is only 167 kms"' while the peak circular velocity of the disc is 
220 kms-'. 

It is possible to work out similar constraints for other halo 
models. For example, one can take the model of Wilkinson & Evans 
(1999), which has a flat rotation curve in the inner regions and a 
sharp fall off in density beyond an outer edge (hereafter referred to 
as the WE model). For this halo profile the total halo mass is given 

by, 
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(20) 



where Owe is the scalelength. For this model we revert to our 
original Miyamoto & Nagai (1975) disc, as used for the uncon- 
tracted NFW profile. By again enforcing the circular velocity to be 
220 km s"' we can use this model to obtain the following 90 per 
cent confidence constraints on the total mass and scalelength, 



MwE = 1.89!{*,f X IO'-Mq. 

^+986 1 



flwE = 3i4r;^^kpc. 



It is clear that our constraints on the halo mass are consistent 
with previous estimates; within the last ten years estimates tend 
to lie in the range 1 - 2 x lO'^M© (see SectionlTJ, which is fully 
consistent with our findings. It is also important to note that our 
constraints are relatively model independent, with only small dif- 
ferences between the mass estimates for our three halo profiles. 

It is also useful to quote the total mass within a certain ra- 
dius of the Galactic centre. For example, our three models pre- 
dict the following constraints on the mass contained within 50 kpc: 
4.04+',^^ X 10" Mo, 3.87+j;«^ x 10" M^, 3.58!°'^:^ x 10" Mq, for the 
uncontracted NFW, contracted NFW, and WE model, respectively. 
These masses are relatively well constrained mainly due to the fact 



that our models must adhere to the v^i 



220 km s constraint. 



Wilkinson & Evans (1999) obtained a value of 5.4!!|^ x 10" Mq, 
which is consistent with our models. 

Given these halo models, we can estimate the maximum 
Galactic radius that our fastest star will reach. From Table[3]we see 
that the fastest star has a rest-frame radial velocity of -449 km s"' . 
We assume that the star's kinetic energy is dominated by its radial 
component, so 



E{r^) = (V? - vlj/2. 



(21) 



At apocentre the kinetic energy is solely due to the angular mo- 
mentum L, which we assume to be conserved. Assuming that the 
distance to the Sun is small, we have 



6.5 Uncertainties and the next steps 

As discussed in the Section|2l our results are based on the assump- 
tion that the velocity distribution near the escape velocity can be 
approximated by a power-law f(v) oc (vesc - |v|)*. In a hierarchical 
universe, this model is unlikely to be valid because the motions of 
the fastest moving stars are expected to be strongly clumped (Helmi 
et al. 2002). Since the velocity distribution would be dominated by 
very few streams, it would be the sum of a few delta functions cen- 
tered on the mean velocities of those streams. It may even be pos- 
sible that each stream originates in the same object, in which case, 
only one orbit is probed. This invalidates the use of statistical ar- 
guments to derive the mass of the Galaxy. Such a high-degree of 
lumpiness only is important when the fastest 1 per cent of the stars 
are considered, that is, those with velocities i', > 3cr ~ 450 km s"'. 

Notice that there are no stars in our sample with such large 
radial velocities. This is the reason why substructure appears not 
be an issue in our case (for example, the results of the bootstrap 
analysis obtained by re-sampling are consistent with the maximum 
likelihood performed using the actual dataset). However, a new ap- 
proach will be necessary in the future. By the time the RAVE survey 
concludes, it is intended that the catalogue will contain up to one 
million stars. If we assume that the rate of detecting high veloc- 
ity stars is unchanged for the rest of the survey, this would result 
in a final sample of ~ 200 stars with radial velocities greater than 
300 km s"'. Such a sample will reveal a large amount of substruc- 
ture. This should also be evident also when the RAVE metallicities 
are added. Gravity determinations, plus the metallicities, will allow 
robust distance estimates and hence the use of tangential velocities. 
Full phase-space information should enable one to perform more 
sophisticated models to determine the mass of our Galaxy. 

All of this highlights the importance of understanding the ex- 
pected shape of the velocity distribution near the tails (Section lJTt . 
The limited number of high-resolution cosmological simulations of 
galaxies like the Milky Way is also a source of uncertainty in our 
estimates of the escape velocity. Both higher resolution (a signifi- 
cantly larger number of stellar particles) and a larger set of simula- 
tions are necessary. 





• 7.5 kpc ' 




-v,.cos b cos / 




L = 





X 


I'iCos b sin / 


(22) 




I J 




\',.sin b 





The apocentre distance r^pa can now be found from 



2rL 



+ f (j-apo). 



(23) 



Equations i2U and l l23b can be solved using any of the models 
for the potential described above. For our constraints on Vesc de- 
termined in Section lSTT] we find that r^po = 102+^^2 kpc for our con- 
tracted NFW halo model. Since this calculation assumes that there 
is no tangential velocity, it is actually a lower-limit for r.^po', for ex- 
ample, if the star has a tangential velocity of 100 km s"' then r^p^ 
increases to ~ 122+3] ''P''- Although we do not know the distance 
to our fastest radial velocity star, by adopting a distance of 1 kpc we 
find that the observed proper motion (17.2 + 9.4 mas yr"' ; Hambly 
et al. 2001) corresponds to a tangential velocity of 82 ± 46 km s"' . 

The question of whether there is a truncation in the stellar halo 
is currently open to debate (e.g. Ivezic et al. 2000; Dehnen et al. 
2006). Therefore our large value for ^apo is important since it could 
disfavour Galactic models with an early truncation. 



7 CONCLUSION 

In this work we have reported new constraints on the local escape 
speed of our Galaxy. We argued that the choice of prior on k may 
have been incorrectly estimated in previous work and deduced a 
different range for this prior (see Section [3Tt . We then applied this 
to a catalogue of radial velocities from the RAVE collaboration, 
augmented with additional stars from the existing survey of Beers 
et al. (2000). Our results provide a 90 per cent confidence interval 
of 498 km s"' < Vesc < 608 km s"', with a median likelihood of 
I'esc = 544 km s"' . We also applied a bootstrap technique, which al- 
lowed us to reduce dependencies on extreme velocities that may be 
unbound or from stars in binary systems, while additionally allow- 
ing us to perform confidence analysis on the distribution of maxi- 
mum likelihood values, v^sc and k, simultaneously. This resulted in 
a 90 per cent confidence interval of 462 km s"' < v^sc < 640 km s"' 
and 0.1 <k< 5.4. 

The fact that i^c is significantly greater than 2v^j^^ implies 
that there must be a significant amount of mass exterior to the So- 
lar circle. Furthermore, from our constraints on Vesc we can infer 
model dependent estimates for the mass of the halo (see Section 
16.4b ; for example, an adiabatically contracted NFW halo profile 
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predicts that M^i, = 1.42+^^^ x IO'^Mq, = 305+^* kpc, and 
c = 7J'^2Q- Similar results for the halo mass were found for both 
an uncontracted NFW halo and a Wilkinson & Evans (1999) halo. 
Although our results are model dependent, we find that our three 
models are in good agreement. The mass within 50 kpc is found to 
be 3. 6-4. Ox 10" Mq. It is interesting to note that our models predict 
low values for the virial velocity, for example iVi, = 142^21 km s"' 
for the contracted NFW model. 

By the time the RAVE survey reaches completion (currently 
predicted to be 2010) it will provide an unparallelled database of 
stellar kinematics, thus allowing dramatic progress in this field. 
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APPENDIX A: THE PRIOR PROBABILITY 
DISTRIBUTIONS FOR THE BOOTSTRAP METHOD 

Al The choice of prior 

Al.l Jeffreys' rules 

It can be quite difficult to choose a priori probabilities for v^sc and 
k especially when not much is known about the quantities. Note 
that the posterior distribution (equation[9](, with uniform prior dis- 
tributions (i.e., -P(i'csc) = P(k) = 1), is equivalent to the normalised 
likelihood equation. Furthermore, even if the prior probabilities are 
non-uniform, the maximization of equations ([Sj and ^ are asymp- 
totically equivalent. This is due to the fact that as n goes to infinity, 
the product in equation (|9} will dominate. However, for small n, 
there may be large differences between the parameter values max- 
imizing /(I'esc, k) and those maximizing the posterior distribution. 
Therefore it is important, for small n, to choose 'good' reference 
priors for I'esc and k. 

One method that can be used to attain reference priors is that 
defined by Sir Harold Jeffreys (Jeffreys 1961). First define the log 
of the normalised likelihood equation (equation [S]! as L(Vesc,^). 
Next, we define a 2 x 2 information matrix, known as the Fisher 
Information matrix: 



myrkj = -E 



a-L(Vesc,fe) 



(Al) 



deidOj 

where = (v'csc, k) and the right hand side is the negative of the 
expectation value of the second derivative. Note that if Vr is a vec- 
tor of n independent observations, by linearity of expectation the 
information is nl(0\v^)i j. However, we do not include this factor n 
since it is a constant and will not affect the maximum likelihood. 
Using this information matrix, Jeffreys argues that a good reference 
prior can be estimated as, 



PA0) yj\I(0M\, 



(A2) 



where |/(6|Vr)| is the determinant of the information matrix. This 
equation is a useful reference prior because it does not require the 
selection of any specific parametrization. However, it is often ar- 
gued that this form of the prior is improved by considering that the 
parameters are independent of each other (Lee 2004). We follow 
this approach, which requires us to neglect the off-diagonal terms 
in the determinant. By applying this analysis to the LT90 formalism 
(equation |6f we obtain, 

Py(Ve.„ k) OC \ (A3) 

(Vesc - Vmin) V«(k + 2) 

LT90 apply another of Jeffreys' Rules that says if a variable 
varies continuously from to infinity, one must adopt an a pri- 
ori probability distribution equal to the reciprocal of the variable 
(Kendall & Stuart 1977). LT90 adopt this form of the prior proba- 
bility distribution for the escape velocity, Plt{^'s:x) °^ 1/i'esc- How- 
ever, note that Vesc does not necessarily vary between (0,oo). We 
have applied a minimum cut-off velocity, as defined in Section 
13.21 Therefore, an arguably more appropriate approach would be 
to say that the escape velocity varies continuously in (v„„„,oo), i.e. 



Pj'b'esc) l/(vesc - v„,j„). This approach can also be applied to the 
parameter k. From equation ^ we can see that for the LT90 for- 
malism the variable k varies between (- 1, oo) and so one could also 
adopt the prior Pj'{k) oc l/(k + 1). 



A 1.2 A priori probability 

The question remains, which prior probabilities should be used? In 
an ideal world, the samples should be large enough so that maxi- 
mization of equation l|9) would be the same for any chosen prior. 
However, we cannot rely on this fact. It is very important to think of 
Jeffreys' Rules as only guidelines for choosing prior distributions, 
especially when dealing with more than one parameter. Therefore 
we have chosen three priors for our analysis. The first is that used 
by LT90 in their maximum likelihood analysis, which will be la- 
belled as Prior 1. The other two are those derived using Jeffreys' 
rules described above, i.e. Pj and Pj>. These priors are useful be- 
cause they provide much more information about k than the LT90 
prior. 

However, these priors need slight modifications. From equa- 
tion ([TJ it can be seen that k > 0, since negative k would imply that 
the probability does not go to zero as v — > Vesc. Therefore for the 
priors derived from Jeffreys' rules we enforce that the Py = Pj' = 
for < 0. In addition, as can be seen from equation l |A3K prior Pj 
diverges as — > 0. This is clearly unacceptable because our likeli- 
hood only goes to zero as — > -1 (equation [6FI . In order to avoid 
this problem we introduce an additional factor of k/ Vfc + 2 into 
prior Pj, which means that Pj — » as — ^ 0, which is useful for 
continuity at = 0. Importantly, we still retain the property that 
— > as A; — > IX). 

All priors are described in Table lAll These different forms for 
the a priori probability distributions apply different assumptions 
and weights upon the variables. To further assess these prior dis- 
tributions, maximization of equation (O using the parameters v^^c 
and k (refer to Section [Z2] l was performed after applying both of 
the above priors. 



A2 Using simulations to assess the choice of prior 

We tested the choice of prior by applying the bootstrap technique to 
a simulated dataset. A random sample of stars was drawn from the 
assumed LT90 probability density function (equation [6]l for user- 
defined values of Vesc = 600 km s^\k = 2.0, and Vmi„ = 300 km s"' . 
These values are similar to what we might expect for our Galaxy. 
We chose to analyse a random sample size of 35 simulated stars as 
this is comparable to the number of stars in our observed sample 
(Section|4j. This mock sample was used to test the bootstrap max- 
imization methods as well as assess chosen priors in equation ^ 
for calculating the escape velocity and k. Note that we apply the 
LT90 approximation (equation |6]l when calculating the maximum 
likelihood. 

The mock 35 star sample was run through the bootstrap max- 
imization technique for the three priors given in Table IaTI applied 
to equation l|9j. The bootstrap performed maximization on 5000 re- 
samples of the original mock data set. The bootstrap distribution of 
peak likelihood points for Vesc is very close to normal. Therefore, 



Note that the reason why this likelihood does not go to zero as — > is 
because of the integral over the unknown tangential velocities performed in 
equation ^3) 
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we used techniques as described in Section |2j2] to compute confi- 
dence for I'esc- However, the bootstrap distribution for k is clearly 
not normal. We determined that the distribution of peak likelihoods 
follows a chi-squared distribution with the number of degrees of 
freedom equalling the mean of the bootstrap distribution. Therefore 
confidence limits for k were derived from a chi-squared distribution 
instead of a normal distribution. 

Table lAll gives the maximization results for each prior de- 
fined in Section IATI The third column of the table represents the 
maximum likelihood values of v^^^ and k calculated from the origi- 
nal mock sample before the bootstrap has been applied. The fourth 
column give the mean bootstrap values of Vesc and k obtained from 
the bootstrap distribution. The standard errors (SE) were computed 
from the standard deviation of the bootstrap resamples as explained 
in Section |T2l The last column gives the 90 per cent confidence 
intervals of Vesc and k. Note, confidence endpoints for Vesc are com- 
puted using the bootstrap normal-distribution approximation, while 
that for k were computed from a chi-squared distribution. 

Comparing the results from the three different priors, there are 
two very noticeable attributes. Firstly, the bootstrap distribution of 
I'esc from Prior 1 has slightly less bias than that from Priors 2 and 3. 
This suggests that Priors 2 and 3 are 'stronger' priors, introducing 
bias into the maximum likelihood analysis. However, the bootstrap 
mean from Prior 2 gives estimates for v^^^ that are still identical, 
within standard errors, to the chosen values for the simulation. Prior 
3, however, seems to slightly underestimate the value of Vesc. The 
other attribute to notice is that the errors and confidence regions 
from Prior 2 are less than that of Prior 1 , while also still containing 
the desired maximum likelihood values for Vesc and k. Therefore it 
would be useful to use the confidence regions from Prior 2 when 
concluding on the confidence of our actual data sample, especially 
since it contains more information about k. Notice that Prior 3 has 
similar confidence regions to Prior 2, but since the bootstrap mean 
estimate for I'esc is beyond 1 sigma of the assumed value we believe 
that Prior 3 may not be as useful. Therefore we will compute our 
analysis using Priors 1 and 2 for comparison. 

To further assess the reliability of the bootstrap method, we 
repeated our analysis on a large number of different mock samples. 
These mock samples were chosen to cover a wide range of val- 
ues for the input parameters of our distribution (vesc 6 [400, 700], 
k 6 [2, 5]). For only 4 per cent of these samples did the boot- 
strap method produce maximum likelihood values beyond 2a of 
the actual values defined. Otherwise, the technique extracted val- 
ues of Vesc and k that were consistent within standard errors to their 
assumed values in the simulation for all a priori probability defi- 
nitions. Furthermore, the technique was applied using several dif- 
ferent initial guesses for v^^c and k during maximization and the 
conclusions from all priors remained unaffected. 



APPENDIX B: THE ARCHIVAL HIGH- VELOCITY STARS 

Here we list the archival stars that were used to augment the RAVE 
dataset. The stars given in Table lB 1 l are from the Beers et al. (2000) 
catalogue of metal poor stars. 
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Label Prior Form Initial MLE Bootstrap SE 90%-Conf. 

Values Mean Values 

1 ^'LrCVesc, ccl/vesc = 63 1 . 1 km s"' 625.6 132.8 (502.7,832.7) 

k = 2.5 2.7 2.9 (0.3,7.3) 

2 /'y(vesc. k) cc (,,^^^_,,_]f )(t_|,2) iw = 604.6kms-i 588.0 74.1 (516.3,730.6) 

A: = 2.1 2.0 1.5 (0.1,6.0) 

3 Py(vesc. Vesc = 563.5 km s-i 544.0 54.2 (495.9,634.6) 

*:=1.2 1.0 1.1 (0.0,3.8) 



Table Al. Data from the optimization of 35 stars with simulation parameters: I'esc = 600 km s" , k = 2.0, v,,,;,, = 300 kms" after applying our derived priors 
to equation ^9)- The third column gives the maximum likelihood values of I'esc and k for the original (non-bootstrapped) sample, while the fourth column gives 
the mean of the bootstrap distribution of maximum likelihood values. The SE column gives the standard errors from the standard deviation of the bootstrap 
distribution. The last column gives the 90 per cent confidence intervals for Vesc and k. The i'esc confidence intervals are estimated using the bootstrap normal- 
distribution approximation, while the k intervals are calculated from a chi-squared distribution with the number of degrees of freedom equal to the bootstrap 
mean value. Prior 1 is the same as that used by LT90, while Priors 2 and 3 are derived from Jefi"reys' rules. All priors are for ^ 0. Note that the non-bootstrap 
analysis employs Prior 1 . 



Name 


I'l- 




RA (J2000) 


Dec (J2000) 


e 


b 


V 




(kms"') 


(kms-') 


(°) 


(°) 


n 


n 


(mag) 


BPS CS 30339-0040 


337.6 


10 


5.10900 


-36.50553 


336.03 


-78.55 


13.00 


BPS CS 22166-0024 


-341.2 


10 


15.97933 


-12.69761 


134.99 


-75.28 


13.86 


BPS CS 22189-0007 


-307.8 


10 


39.39217 


-12.93944 


188.44 


-61.42 


13.21 


BPS CS 22173-0015 


-320.5 


10 


61.44904 


-17.35106 


211.13 


-44.26 


13.22 


BPS CS 22177-0009 


-307.2 


10 


61.91921 


-25.04522 


221.74 


-46.17 


14.27 


BPS CS 22871-0070 


-311.1 


10 


220.31808 


-18.61142 


336.14 


37.08 


14.76 


BD+01 3070 


-306.2 


3 


230.66700 


1.26469 


3.87 


45.48 


10.06 


BPS CS 30312-0006 


-342.7 


10 


233.24071 


-1.88931 


2.76 


41.49 


13.65 


HD 178443 


336.4 


10 


287.65358 


-43.27658 


354.18 


-21.52 


9.99 


BPS CS 22964-0074 


-340.1 


10 


297.37250 


-39.71078 


0.09 


-27.58 


14.46 


BPS CS 22943-0087 


434.7 


10 


304.82696 


-46.45764 


353.38 


-34.05 


14.24 


V* V1645 Sgr 


-326.3 


14 


305.18525 


-41.11817 


359.80 


-33.68 


11.96 


V* AO Peg 


310.8 


35 


321.77038 


18.63386 


69.94 


-22.58 


12.83 


BPS CS 22948-0093 


396.3 


10 


327.63137 


-41.13033 


0.21 


-50.54 


15.18 


BPS CS 22951-0055 


341.9 


10 


328.68683 


-46.52406 


351.66 


-50.36 


14.78 


HD 214161 


-375.3 


10 


339.28350 


-40.51067 


358.40 


-59.31 


9.10 


BPS CS 22949-0026 


313.2 


10 


350.75979 


-5.21428 


75.14 


-59.61 


15.23 



Table Bl. The archival high velocity stars, taken from Beers et al. (2000). The first column shows the Galactocentric radial velocity. Note that three stai's have 
been excluded from this sample as their estimated distances are greater than 5 kpc; all stai's listed in this table have estimated distances of less than 2.5 kpc 
(see Section l43t . 



